Multiple word prototypes and disambiguation

More than one type of vector

* Type vectors: Represent a word as a summary of all its occurrences, across all of its senses.
* Token vectors: Represent the meaning of a single occurrence of a word.
* Sense vectors(?): Represent a sense of a word as a summary of all occurrences of that sense.

Sense/synset vectors

Motivations:
* "We note that the collision of word senses often hinders performance on the behavioral data from Mitchell and Lapata (2010)." (Fyshe et al. 2015) [Fyshe, A., Wehbe, L., Talukdar, P., Murphy, B., & Mitchell, T. A Compositional and Interpretable Semantic Space.]

TODO: Pelevina et al. (2016) [Pelevina, M., Arefiev, N., Biemann, C., & Panchenko, A. (2016). Making Sense of Word Embeddings. In Proceedings of the 1st Workshop on Representation Learning for NLP (pp. 174–183). Berlin, Germany: Association for Computational Linguistics.]

Models of word meaning in context

Huang, E. H., Socher, R., Manning, C. D., & Ng, A. Y. (2012). Improving word representations via global context and multiple word prototypes. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers-Volume 1 (pp. 873–882).

A summary: Dinu et al. (2012) [Dinu, G., Thater, S., & Laue, S. (2012, June). A comparison of models of word meaning in context. In Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 611-615). Association for Computational Linguistics.]

Type-based models

* Erk 2008 [Erk, K., & Padó, S. (2008, October). A structured vector space model for word meaning in context. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (pp. 897-906). Association for Computational Linguistics.]: combining the type vector with the inverse selectional preference vector of context (c).
  This is simply the centroid of the vectors of all words that take the context as direct object (r):

  v(w,r,c) = \left( \frac{1}{n}\sum_{w'}f(w',r,c) \cdot \vec{w'} \right) \times \vec{w}

* Thater 2010 [Thater, S., Fürstenau, H., & Pinkal, M. (2010, July). Contextualizing semantic representations using syntactically enriched vector models. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (pp. 948-957). Association for Computational Linguistics.]: second-order vectors as the basic representation for the target word.
* Thater 2011 [Thater, S., Fürstenau, H., & Pinkal, M. (2011). Word Meaning in Context: A Simple and Effective Vector Model. In IJCNLP (pp. 1134-1143).]: re-weight the vector components of the target word, based on distributional similarity with the context word.

Token-based models

Latent variable models

Joint model of sense learning and disambiguating

Efficient Non-parametric Estimation of Multiple Embeddings per Word in Vector Space (2014) TODO

From "Vector Space Models of Lexical Meaning", Stephen Clark:

Another issue that has been glossed over is that of word sense. Curran (2004) conflates the various senses of a word, so that the gold standard synonyms for company, for example, against which the output of his automatic system are compared, include synonyms for the companionship sense of company, the armed forces sense, the gathering sense, and so on.

One piece of work in distributional semantics which explicitly models senses is Schütze (1998). Schütze uses a two-stage clustering method in which the first stage derives word vectors using the methods described in Section 3 (specifically using the window method with a window size of 50). Then, for a particular target word in context, each of the word vectors for the context words are added together (and divided by the number of context words) to derive a centroid for the particular context.
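
The centroid step just described can be sketched in a few lines. This is only an illustration and not Schütze's implementation: the three-dimensional toy vectors, the basis labels (clothing, law, food) and the function name `context_centroid` are all assumptions made up for the example.

```python
from typing import Dict, List, Sequence


def context_centroid(context_words: Sequence[str],
                     word_vectors: Dict[str, List[float]]) -> List[float]:
    """Average the first-order vectors of the context words.

    The result is a second-order context vector for one occurrence
    of a target word.
    """
    # Keep only the context words for which a vector is available.
    vectors = [word_vectors[w] for w in context_words if w in word_vectors]
    if not vectors:
        raise ValueError("no context word has a vector")
    dims = len(vectors[0])
    # Add the vectors together, then divide by the number of vectors used.
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]


# Hypothetical toy vectors; basis dimensions: (clothing, law, food).
toy_vectors = {
    "tie":    [4.0, 1.0, 0.0],
    "jacket": [5.0, 0.0, 1.0],
    "wear":   [3.0, 0.0, 0.0],
}

centroid = context_centroid(["tie", "jacket", "wear"], toy_vectors)
print(centroid)
```

With these made-up values, the clothing dimension dominates the centroid even though "tie" and "jacket" each carry some weight on another dimension.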

The effect of deriving this second-order context vector is that the contextual words will act as word sense disambiguators, and when added together will emphasise the basis vectors pertaining to the particular sense. Schütze gives the example of suit. Consider an instance of suit in a clothing, rather than legal, context, so that it is surrounded by words such as tie, jacket, wear, and so on. Of course these context words are potentially ambiguous as well, but the effect of adding them together is to emphasise those basis vectors which most of the context words have in common, which in this case is basis vectors relating to clothes.

Li & Jurafsky (2015) [Li, J., & Jurafsky, D. (2015). Do multi-sense embeddings improve natural language understanding? arXiv Preprint arXiv:1506.01070.]: "We find that multi-sense embeddings do improve performance on some tasks (part-of-speech tagging, semantic relation identification, semantic relatedness) but not on others (named entity recognition, various forms of sentiment analysis)."